Abstract

We determine the uncertainties on observables arising from the errors on the experimental
data that are fitted in the global MRST2001 parton analysis. By diagonalizing the
error matrix we produce sets of partons suitable for use within the framework of linear
propagation of errors, which is the most convenient method for calculating the uncertainties.
Despite the potential limitations of this approach we find that it can be made to
work well in practice. This is confirmed by our alternative approach of using the more
rigorous Lagrange multiplier method to determine the errors on physical quantities directly.
As particular examples we determine the uncertainties on the predictions of the
charged-current deep-inelastic structure functions, on the cross-sections for $W$ production
and for Higgs boson production via gluon-gluon fusion at the Tevatron and the LHC, on
the ratio of $W^-$ to $W^+$ production at the LHC and on the moments of the non-singlet
quark distributions. We discuss the corresponding uncertainties on the parton distributions
in the relevant $x$, $Q^2$ domains. Finally, we briefly look at uncertainties related to the
fit procedure, stressing their importance and using $\sigma_W$, $\sigma_H$ and extractions of
$\alpha_S(M_Z^2)$ as examples. As a by-product of this last point we present a slightly updated
set of parton distributions, MRST2002.


^Royal Society University Research Fellow. 



1 Introduction 


Recently, much attention has been focused on uncertainties associated with the parton distributions
that are determined in the next-to-leading order (NLO) global analyses of a wide range
of deep inelastic and related scattering data. There are many sources of uncertainty, but they
can be divided into two classes: those which are associated with the experimental errors on the
data that are fitted in the global analysis and those which are due to what can loosely be called
theory errors. In this latter category we have uncertainties due to (i) NNLO and higher-order
DGLAP contributions, (ii) effects beyond the standard DGLAP expansion, such as extra $\ln x$
and $\ln(1 - x)$ terms, higher twist and saturation contributions, (iii) the particular choice of the
parametric form of the starting distributions, (iv) heavy target corrections, (v) model assumptions,
such as $s = \bar{s}$. In order to estimate some of these 'theory' errors, we can also look at the
uncertainties arising from different choices of the data cuts ($W_{\rm cut}$, $x_{\rm cut}$, $Q^2_{\rm cut}$), defined such that
data with values of $W$, $x$ or $Q^2$ below the cut are excluded from the global fit. This approach
indicates where the current theory is struggling to fit the data compared to other regions.

Here we study the uncertainties due to the errors on the data, and leave the discussion of
the 'theory' uncertainties to a second paper. Other groups [1]-[7] have also concentrated on the
experimental errors and have obtained estimates of the uncertainties on parton distributions
within a NLO QCD framework, using a variety of competing procedures. Of course, the parton
distributions are not, themselves, physical quantities. However, using the standard approach of
the linear propagation of errors, these uncertainties of the parton distributions can be translated
into uncertainties on observables. Therefore, we first follow the general approach in [4] and [5],
the Hessian method, and diagonalize the error matrix, parameterizing an increase in $\chi^2$ of the
fit in terms of a quadratic function of the variation of the parameters away from their best
fit values. This gives us a number of sets of partons with variations from the minimum in
orthogonal directions which can be used in a simple manner to calculate the uncertainty on any
physical quantity. However, this approach depends for its reliability on the assumption that
the quadratic dependence on the variation of the parton parameters is very good. We find that
this approximation, with some modifications of the precise framework, i.e. the elimination of
some parameters and rescaling of others, can be made to work well. We make available 30 sets
of partons - 2 for each of the 15 eigenvector directions in parton parameter space - which can
be used to calculate the uncertainties on any physical quantity.

Despite its convenience, the Hessian approach does suffer from some problems if one looks
at it in detail, and if one tries to extrapolate results, in particular if we consider large increases
in $\chi^2$. It is also not, in principle, the most suitable method when allowing $\alpha_S$ to vary as one
of the free parameters in the fit. Hence, in this paper we also investigate the uncertainties on
observables directly. In order to do this we apply the Lagrange multiplier method [8] to the
observables themselves, therefore avoiding some of the approximations involved in the linear
propagation of errors from partons to the observables, and confirming that these approximations
do not usually cause serious problems. When using this Lagrange multiplier approach, the
resulting sets of parton distributions, which correspond to the extreme values of each observable,
can to a certain extent be thought of as the maximum allowed variation of the dominant
contributing partons in the relevant kinematic ($x$, $Q^2$) domain. We select observables which are
particularly relevant for experiments at present and future colliders, and which illustrate the
uncertainties on specific partons in a variety of kinematic ($x$, $Q^2$) domains. In order to determine
the true uncertainty on quantities we also let $\alpha_S(M_Z^2)$ vary along with the parameters describing
the parton distributions directly, which is easy to implement in this approach. Some quantities
are then far more sensitive to $\alpha_S(M_Z^2)$ than others. Fortunately our global fit [9] produces a
value of $\alpha_S(M_Z^2)$ which is consistent with the world average, with the same type of error, i.e.,
$\alpha_S(M_Z^2) = 0.119 \pm 0.002$. Hence, it is completely natural to simply let $\alpha_S(M_Z^2)$ vary as a free
parameter in the fit in the same way as all the other parameters when determining uncertainties.
However, we also perform an investigation of the uncertainties with $\alpha_S(M_Z^2)$ fixed at 0.119 in
order to study more directly which variations in the parton distributions are responsible for
extreme variations in given physical quantities, and to compare with the results of the Hessian
approach.

The physical observables that we select as examples in this introductory study are, first,
the charged-current structure functions $F_2^{\rm CC}(e^\pm p)$ for deep inelastic scattering at high $x$ and
$Q^2$ at HERA. These observables almost directly represent the $d$, $u$ valence quarks at high $x$
and $Q^2$, where deep inelastic data do exist [10]-[12], but have errors of 25% or more at present.
The precision on these data is expected to increase dramatically in the near future. Second,
we determine the uncertainties on the cross-sections $\sigma_W$ and $\sigma_H$, for $W$ boson production and
for the production of a Higgs boson of mass^1 $M_H = 115$ GeV by gluon fusion respectively, at
Tevatron and LHC energies. The cross section $\sigma_W$ is sensitive to the sea quarks (and also, at
the Tevatron energy, weakly sensitive to the valence quarks) in a range of rapidity centered
about $x \sim M_W/\sqrt{s}$, and for $Q^2 \sim M_W^2$. Similarly, $\sigma_H$ is sensitive to the gluon distribution in
the domain $x \sim M_H/\sqrt{s}$ and $Q^2 \sim M_H^2$.

As a third example we determine the uncertainty on the ratio of $W^-$ to $W^+$ production at
the LHC energy. This ratio is expected to be extremely accurately measured in the LHC experiments.
Other relevant examples, which we study, are the uncertainties of the moments of the
non-singlet ($u - d$) quark distributions. These are quantities for which lattice QCD predictions
are becoming available; see, for example, Refs. [13, 14].

The same techniques can be easily and quickly applied to a wide variety of other physical 
processes sensitive to different partons and different domains. Besides giving a direct evaluation 
of the uncertainties on the observables, we can, in principle, unfold this information to map out 
the uncertainties on NLO partons over the whole kinematic domain where perturbative QCD 
is applicable. 

The plan of the paper is as follows. In Section 2 we discuss the Hessian method, and
outline our extraction of different parton distribution sets using this approach. In particular
we highlight the problems encountered, and how they are dealt with in order to obtain reliable
results. We make the sets of partons obtained publicly available. In Section 3 we briefly recall
the elements of the Lagrange multiplier method. In the following four sections we determine
the uncertainties of the observables that we have mentioned above. This will involve a series
of global fits in which the observables are constrained at different values in the neighbourhood
of their values obtained in the optimum global fit. In each case we explore, and discuss, the
allowed variation of the dominantly contributing partons. Using this more rigorous method
we also confirm the general appropriateness of the Hessian approach, but discuss where it can
start to break down.

^1 There is nothing special about the choice of 115 GeV. We may choose different values in order to probe the
gluon in different $x$, $Q^2$ domains.

Finally, in Section 8, we summarize and briefly investigate the uncertainties associated with
the initial assumptions made in performing the global fit. In order to do this we compare
the $W$ and $H$ boson predictions with those obtained using both a slightly updated set of our
own partons, MRST2002, and using the CTEQ6 partons [5]. (All the results in Sections 2-7
are based on MRST2001 partons [9].) We find that for the comparison with CTEQ some of
the variations in predictions are surprisingly large. We also illustrate the same result for the
extractions of $\alpha_S(M_Z^2)$ by various different groups. This implies that uncertainties involved with
initial assumptions and also with theoretical corrections can be more important than those due
to errors on the data.


2 The Hessian method 


The basic procedure involved in this approach is discussed in detail in [4], but we briefly
introduce the important points here. In this method one assumes that the deviation in $\chi^2$
for the global fit^2 from the minimum value $\chi_0^2$ is quadratic in the deviation of the parameters
specifying the input parton distributions, $a_i$, from their values at the minimum, $a_i^0$. In this case
we can write

$$\Delta\chi^2 = \chi^2 - \chi_0^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} H_{ij}\,(a_i - a_i^0)(a_j - a_j^0), \qquad (1)$$

where $H_{ij}$ is an element of the Hessian matrix, and $n$ is the number of free input parameters.
In this case the standard linear propagation of errors allows one to calculate the error on any
quantity $F$ using the formula

$$(\Delta F)^2 = \Delta\chi^2 \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial F}{\partial a_i}\, C_{ij}(a)\, \frac{\partial F}{\partial a_j}, \qquad (2)$$

where $C_{ij}(a) = (H^{-1})_{ij}$ is the covariance, or error matrix of the parameters, and $\Delta\chi^2$ is the
allowed variation in $\chi^2$. Hence, in principle, once one has either the Hessian or covariance
matrix (and a suitable choice of $\Delta\chi^2$) one can calculate the error on any quantity.
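
As an illustration of Eq. (2), the following is a minimal sketch of linear error propagation for a hypothetical two-parameter fit. The covariance matrix, the gradient of $F$ and the function name `propagate_error` are assumptions made purely for this example; they are not taken from the global analysis.

```python
import math

def propagate_error(grad, cov, delta_chi2=1.0):
    """Delta F from Eq. (2): (Delta F)^2 = Delta chi^2 * sum_ij g_i C_ij g_j."""
    n = len(grad)
    quad = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    return math.sqrt(delta_chi2 * quad)

# Hypothetical two-parameter example; the numbers are illustrative only.
cov = [[0.04, 0.01],
       [0.01, 0.09]]          # C = H^{-1}, covariance matrix of the parameters
grad = [2.0, -1.0]            # dF/da_i evaluated at the best fit

dF_1 = propagate_error(grad, cov, delta_chi2=1.0)
dF_50 = propagate_error(grad, cov, delta_chi2=50.0)
print(dF_1, dF_50)
```

In this linear framework the error simply scales as the square root of the chosen $\Delta\chi^2$, so the two results differ by a factor of $\sqrt{50}$.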

^2 The data that are fitted can be found in Refs. [6, 10, 11] and [15]-[29]. We treat the errors as in [9].
However, as demonstrated in [4], it is more convenient and more numerically stable to diagonalize
either the Hessian or covariance matrix, and work in terms of the eigenvectors. Since
the Hessian and covariance matrices are symmetric they have a set of orthogonal eigenvectors $v_k$
defined by

$$\sum_{j=1}^{n} C_{ij}(a)\, v_{jk} = \lambda_k\, v_{ik}. \qquad (3)$$

Moreover, because variations in some directions in parameter space lead to deterioration in
the quality of the fit far more quickly than others, the eigenvalues $\lambda_k$ span several orders of
magnitude. Hence it is helpful to work in terms of the rescaled eigenvectors $e_{ik} = \sqrt{\lambda_k}\, v_{ik}$. Then
the parameter displacements from the minimum may be expressed as

$$\Delta a_i \equiv (a_i - a_i^0) = \sum_{k=1}^{n} e_{ik}\, z_k, \qquad (4)$$

or using the orthogonality of the eigenvectors

$$z_i = (\lambda_i)^{-1} \sum_{k=1}^{n} e_{ki}\, \Delta a_k, \qquad (5)$$

i.e., the $z_i$ are the appropriately normalized combinations of the $\Delta a_k$ which define the orthogonal
directions in the space of deviation of parton parameters. In practice a $z_i$ is often dominated
by a single $\Delta a_k$.^3
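
The change of variables in Eqs. (3)-(5) can be sketched numerically for a two-parameter toy problem. Everything below is an assumption made for illustration: the covariance matrix and the displacement are invented, the helper `eig_sym_2x2` is a hypothetical closed-form routine valid only for symmetric 2x2 matrices, and the eigenvectors are taken to be normalized to unit length.

```python
import math

def eig_sym_2x2(c):
    """Eigenvalues and orthonormal eigenvectors v[i][k] of a symmetric 2x2 matrix."""
    a, b, d = c[0][0], c[0][1], c[1][1]
    mean = 0.5 * (a + d)
    half = math.hypot(0.5 * (a - d), b)
    lams = [mean + half, mean - half]
    if b == 0.0:
        cols = [(1.0, 0.0), (0.0, 1.0)] if a >= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        cols = []
        for lam in lams:
            vx, vy = b, lam - a            # solves (C - lam) v = 0 when b != 0
            norm = math.hypot(vx, vy)
            cols.append((vx / norm, vy / norm))
    v = [[cols[k][i] for k in range(2)] for i in range(2)]
    return lams, v

# Hypothetical covariance matrix C and parameter displacement Delta a.
cov = [[0.04, 0.01],
       [0.01, 0.09]]
da = [0.1, -0.2]

lams, v = eig_sym_2x2(cov)
# Rescaled eigenvectors e_ik = sqrt(lambda_k) v_ik.
e = [[math.sqrt(lams[k]) * v[i][k] for k in range(2)] for i in range(2)]

# Eq. (5): z_i = (1/lambda_i) sum_k e_ki Delta a_k.
z = [sum(e[k][i] * da[k] for k in range(2)) / lams[i] for i in range(2)]
# Eq. (4): Delta a_i = sum_k e_ik z_k recovers the original displacement.
da_back = [sum(e[i][k] * z[k] for k in range(2)) for i in range(2)]

# Delta chi^2 computed directly from Eq. (1) with H = C^{-1} ...
det = cov[0][0] * cov[1][1] - cov[0][1] ** 2
hess = [[cov[1][1] / det, -cov[0][1] / det],
        [-cov[0][1] / det, cov[0][0] / det]]
chi2_direct = sum(hess[i][j] * da[i] * da[j] for i in range(2) for j in range(2))
# ... and as the squared length of z.
chi2_from_z = sum(zi * zi for zi in z)
print(chi2_direct, chi2_from_z)
```

The two values of $\Delta\chi^2$ agree, which is the statement that surfaces of constant $\chi^2$ are spheres in the rescaled variables.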

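The error determination becomes much simpler in terms of the $z_i$. The increase in $\chi^2$ is

$$\Delta\chi^2 = \sum_{i=1}^{n} z_i^2, \qquad (6)$$

i.e., the surface of constant $\chi^2$ is a hyper-sphere of given radius in $z$-space. Similarly the error
on the quantity $F$ is now

$$\Delta F = \sqrt{\Delta\chi^2}\, \left[\sum_{i=1}^{n} \left(\frac{\partial F}{\partial z_i}\right)^2\right]^{1/2}. \qquad (7)$$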
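
For completeness, the steps leading from Eqs. (1)-(5) to Eqs. (6) and (7) can be written out as follows. This is a short sketch that assumes the eigenvectors $v_k$ are normalized to unit length, so that $\sum_i v_{ik} v_{il} = \delta_{kl}$.

```latex
\begin{align*}
C_{ij} &= \sum_k \lambda_k\, v_{ik} v_{jk} = \sum_k e_{ik} e_{jk},
\qquad
H_{ij} = (C^{-1})_{ij} = \sum_k \lambda_k^{-1}\, v_{ik} v_{jk}, \\
z_k &= \lambda_k^{-1} \sum_i e_{ik}\, \Delta a_i
     = \lambda_k^{-1/2} \sum_i v_{ik}\, \Delta a_i, \\
\Delta\chi^2 &= \sum_{i,j} H_{ij}\, \Delta a_i\, \Delta a_j
     = \sum_k \lambda_k^{-1} \Big( \sum_i v_{ik}\, \Delta a_i \Big)^2
     = \sum_k z_k^2, \\
\frac{\partial F}{\partial z_k} &= \sum_i \frac{\partial F}{\partial a_i}\, e_{ik}
\;\Longrightarrow\;
\sum_k \Big( \frac{\partial F}{\partial z_k} \Big)^2
     = \sum_{i,j} \frac{\partial F}{\partial a_i}\, C_{ij}\, \frac{\partial F}{\partial a_j}
     = \frac{(\Delta F)^2}{\Delta\chi^2}.
\end{align*}
```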


Thus it is convenient to introduce parton sets $S_k^\pm$ for each eigenvector direction, i.e., from
Eq. (4) we define

$$\Delta a_i(S_k^\pm) = \pm t\, e_{ik}, \qquad (8)$$

where the tolerance $t$ is defined by $t = \sqrt{\Delta\chi^2}$ and $\Delta\chi^2$ is the allowed deterioration in fit quality
for the error determination. Then, assuming the quadratic behaviour of $F$ about the minimum,
(7) becomes the simple expression

$$\Delta F = \frac{1}{2} \left[\sum_{k=1}^{n} \left(F(S_k^+) - F(S_k^-)\right)^2\right]^{1/2}. \qquad (9)$$
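
In practice Eq. (9) only requires the value of the observable evaluated with each of the eigenvector sets. A minimal sketch is given below; the function name `hessian_uncertainty` and the toy observable (exactly linear in the $z_k$, with invented slopes) are assumptions made for illustration, not part of any released code.

```python
import math

def hessian_uncertainty(f_plus, f_minus):
    """Symmetric uncertainty of Eq. (9) from F(S_k^+) and F(S_k^-)."""
    if len(f_plus) != len(f_minus):
        raise ValueError("need one '+' and one '-' value per eigenvector")
    return 0.5 * math.sqrt(sum((fp - fm) ** 2 for fp, fm in zip(f_plus, f_minus)))

# Hypothetical check: an observable that is exactly linear in z,
# F = f0 + sum_k slope_k z_k, evaluated at z_k = +t and z_k = -t.
t = math.sqrt(50.0)           # tolerance for Delta chi^2 = 50
f0 = 100.0
slopes = [3.0, 4.0]           # dF/dz_k, invented for the example
f_plus = [f0 + t * g for g in slopes]
f_minus = [f0 - t * g for g in slopes]

dF = hessian_uncertainty(f_plus, f_minus)
print(dF)                     # equals t * sqrt(3^2 + 4^2) = 5 t for this linear case
```

For a real observable one would replace the two lists by the 15 pairs of predictions obtained with the "+" and "-" parton sets.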


^3 CTEQ have even implemented the diagonalization procedure in the fitting procedure itself in order to
improve numerical stability [30]. We do not think this will have effects significant enough to outweigh the
inherent errors in the Hessian approach described below.
If everything were ideal this framework would provide us with a simple and efficient method
for calculating the uncertainty due to experimental errors on any quantities, where we would
use the standard choice of $\Delta\chi^2 = 1$. However, the real situation is not so simple, and there are
two major complications we must overcome in order to obtain reliable results.

Although, in principle, the $1\sigma$ uncertainty in any cross-section should be given by $\Delta\chi^2 = 1$,
the complicated nature of the global fitting procedure, where a large number of independent
data sets are used, results in this being an unrealistically small uncertainty [31]. This is undoubtedly
due to some failure of the theoretical approximation to work absolutely properly
over the full range of data, which introduces the type of theoretical errors outlined in the Introduction,
and also due to some sources of experimental error not being precisely quantified.
Both problems are in practice extremely difficult to surmount. We shall implicitly ignore the
potential theoretical error in this paper, but account for the lack of ideal behaviour in the
framework by determining the uncertainties using a larger $\Delta\chi^2$. We estimate $\Delta\chi^2 = 50$ to be a
conservative uncertainty (perhaps of the order of a 90% confidence level or a little less than $2\sigma$)
due to the observation that an increase of 50 in the global $\chi^2$, which has a value $\chi^2 = 2328$ for
2097 data points, usually signifies that the fit to one or more data sets is becoming unacceptably
poor. We find that an increase $\Delta\chi^2$ of 100 normally means that some data sets are very
badly described by the theory. Though this estimation does not rely on any real mathematical
foundation we do not think it is any less valid than the approaches used in e.g. [5] or [1, 7], both
of which ultimately appeal to some value judgment rather than using all available information
in a statistically rigorous manner, and ultimately give similar results. The approaches [2, 3, 6]
do use $\Delta\chi^2 = 1$ but either rely on much smaller and more internally compatible data sets, or
in some cases have rather small errors.

The second complication is the breakdown of the simple quadratic behaviour in terms of
variations of the parameters, i.e., the fact that Eq. (1) may receive significant corrections
and the simple linear propagation of errors is therefore not accurate. Of course, we expect
some deviations from this simple form for very large $\Delta\chi^2$, but unfortunately very significant
deviations can occur for relatively small $\Delta\chi^2$, as outlined below. Due to the very large amount
of data in our global fit, we have a lot of parameters in order to allow sufficient flexibility
in the form of the parton distributions. Each of the valence quarks and the total sea quark
contribution is parameterized in the form

$$xq(x, Q_0^2) = A\,(1 - x)^{\eta}\,(1 + \epsilon x^{0.5} + \gamma x)\,x^{\delta}, \qquad (10)$$

where for the valence quarks the normalization $A$ is set by the number of valence quarks of
each type. Because we find it necessary to have a negative input gluon at low $x$ the gluon
parameterization has been extended to

$$xg(x, Q_0^2) = A_g\,(1 - x)^{\eta_g}\,(1 + \epsilon_g x^{0.5} + \gamma_g x)\,x^{\delta_g} - A_-\,(1 - x)^{\eta_-}\,x^{-\delta_-}, \qquad (11)$$

where $A_g$ is determined by the momentum sum rule, and $\eta_-$ can be set to some fixed large value,
e.g. 10 or 20, so that the second term only influences large $x$. The combination $\Delta \equiv \bar{u} - \bar{d}$
has a slightly different parameterization, i.e.,

$$x\Delta(x, Q_0^2) = A\,(1 - x)^{\eta}\,(1 + \gamma x + \delta x^2)\,x^{\delta}. \qquad (12)$$

Overall, this gives 24 free parameters. In our standard fits we allow all these parameters to vary.
However, when investigating in detail the small departures from the global minimum we notice
that a certain amount of redundancy in parameters leads to potentially disastrous departures
from the behaviour in Eq. (1). For example, in the negative term in the gluon parameterization
very small changes in the value of $\delta_-$ can be compensated almost exactly by a change in $A_-$
and (to a lesser extent) in the other gluon parameters over the range of $x$ probed, and therefore
changes in $\delta_-$ lead to very small changes in $\chi^2$. However, at some point the compensation starts
to fail significantly and the $\chi^2$ increases dramatically. Hence, this certain degree of redundancy
between $\delta_-$ and $A_-$ leads to a severe breaking of the quadratic behaviour in $\Delta\chi^2$. Essentially
the redundancy between the parameters leads to a very flat direction in the eigenvalue space (a
very large/small eigenvalue of the covariance/Hessian matrix) which means that cubic, quartic
etc. terms dominate. During the process of diagonalization this bad behaviour feeds through
into the whole set of eigenvectors to a certain extent.
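
The compensation between $\delta_-$ and $A_-$ can be illustrated with a small numerical sketch. The parameter values, the $x$ grid and the use of an unweighted least-squares distance between curves (as a crude stand-in for the $\chi^2$ of the global fit) are all assumptions made for this example only.

```python
import math

def negative_gluon_term(x, a_minus, eta_minus, delta_minus):
    """Second term of Eq. (11): A_- (1 - x)^{eta_-} x^{-delta_-}."""
    return a_minus * (1.0 - x) ** eta_minus * x ** (-delta_minus)

# Hypothetical parameter values and x grid, chosen only for illustration.
A0, ETA, DELTA0 = 0.2, 10.0, 0.30
xs = [10.0 ** (-4.0 + 3.0 * i / 29) for i in range(30)]   # 1e-4 ... 1e-1
ref = [negative_gluon_term(x, A0, ETA, DELTA0) for x in xs]

# Shift delta_- slightly and ask how well a change in A_- alone can absorb it.
delta_new = DELTA0 + 0.01
shape = [negative_gluon_term(x, 1.0, ETA, delta_new) for x in xs]

# Best compensating normalisation from linear least squares:
# A' = sum(ref * shape) / sum(shape^2).
a_new = sum(r * s for r, s in zip(ref, shape)) / sum(s * s for s in shape)

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

rms_uncompensated = rms([A0 * s - r for s, r in zip(shape, ref)])
rms_compensated = rms([a_new * s - r for s, r in zip(shape, ref)])
print(a_new, rms_uncompensated, rms_compensated)
```

After refitting the normalisation the residual is much smaller than for the shift in $\delta_-$ alone, which is the near-degeneracy that produces a very flat direction in parameter space.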

Therefore, in order that the Hessian method work at all well we have to eliminate the largest
eigenvalues of the covariance matrix, i.e., remove the redundancy from the input parameters.
In order to do this we simply fix some of the parameters at their best fit values so that the
Hessian matrix only depends on a subset of parameters that are sufficiently independent that
the quadratic approximation is reasonable. In fact we finish up with 15 free parameters in total
- 3 for each of the 5 different types of input parton. In particular, fixing the other parameters
at the best fit values we find that $\eta_g$, $\delta_g$ and $\delta_-$ are sufficient for the gluon - one for high $x$,
one for medium $x$ and one for low $x$. However, we emphasize that we cannot simply set the
other parameters to zero. For example $A_-$ must be of a size as to allow a sufficiently negative
input gluon at low $x$ with a sensible value of $\delta_-$, but we cannot allow it to vary simultaneously
with $\delta_-$. We could possibly allow one or two more parameters to be free, but judge that the
improvements due to increased flexibility in the parton variations do not outweigh the
deterioration in the quality of the quadratic approximation. We note that this problem seems to be a
feature of the full global fits obtained by CTEQ and MRST, and that the other fitting groups
have not yet needed to introduce enough parameters to notice such redundancy. It has clearly
been noticed by CTEQ though, since in [4] they only have 16 free parameters out of a possible
22, and in [5], where they use a significantly altered type of parameterization, they have only
20 free parameters out of a possible 26.

Hence, we produce 30 sets of parton distributions labeled by $S_k^\pm$ to go along with the central
best fit; that is 15 "+" sets corresponding to each eigenvector direction, and 15 "-" sets.^4 Even
though we have limited the number of free parameters in the calculation of the Hessian matrix,
we note that we still have significant departure from the ideal quadratic behaviour. For the
10 or so lowest eigenvalues of the covariance matrix the quadratic approximation is very good
- the distance needed to go along one of the $z_i$ to produce $\Delta\chi^2 = 50$ being the expected
$\sqrt{50} = 7.07$ to good accuracy in both "+" and "-" directions. However, for 4 or 5 of the largest
eigenvalues of the covariance matrix, corresponding mainly to the large-$x$ $d$ quark, large-$x$ gluon
and $\bar{u} - \bar{d}$ distributions, the absolute scaling and symmetry break down somewhat. In the very
worst case of the largest eigenvalue, the scale factors to produce $\Delta\chi^2 = 50$ are 9.5 and 4.5
in the two opposite directions. In order to produce the sets corresponding to $\Delta\chi^2 = 50$ we
have to multiply the parton deviations required for $\Delta\chi^2 = 1$ by these scale factors rather than
the expected 7.07. (In fact we do this for all 30 sets, but in most cases the scale factor is in
the range 6.5-7.5.) Hence, as in [4, 5], this necessitates the supply of both "+" and "-" sets,
whereas in the quadratic approximation one could easily be obtained from the other. Indeed
from Fig. 9 of [4] it is clear that CTEQ encounter a breakdown of the quadratic behaviour of
much the same type that we do.

^4 In order to produce the errors on the parton distributions a higher numerical accuracy was required than
that used when we previously found just the "best fit". This results in the partons from the central fit being
very slightly different to the standard MRST2001 partons, and we label them by MRST2001C. In fact some of
the input parameters are quite different to those in the MRST2001 default, but the partons themselves differ
by fractions of a percent. This is an example of the redundancy in some input parameters noted above. The 31
parton sets ($S_k^\pm$, MRST2001C) are available at http://durpdg.dur.ac.uk/hepdata/mrs.
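
The size of the departure from quadratic behaviour implied by these scale factors can be made explicit with a little arithmetic. The sketch below assumes only that, for exactly quadratic behaviour, a displacement of `scale` times the $\Delta\chi^2 = 1$ displacement gives $\Delta\chi^2 = {\rm scale}^2$; the function name is hypothetical.

```python
import math

def quadratic_delta_chi2(scale):
    """Delta chi^2 that exact quadratic behaviour would give for a displacement
    equal to `scale` times the Delta chi^2 = 1 displacement."""
    return scale ** 2

expected_scale = math.sqrt(50.0)          # 7.07 for ideal quadratic behaviour
print(f"ideal scale factor: {expected_scale:.2f}")

# Worst-case scale factors quoted in the text for the largest eigenvalue.
for scale in (9.5, 4.5):
    print(f"scale {scale}: quadratic extrapolation would give "
          f"Delta chi^2 = {quadratic_delta_chi2(scale):.2f}, actual value is 50")
```

In other words, in the worst direction a naive quadratic extrapolation from the minimum would misjudge the displacement needed for $\Delta\chi^2 = 50$ substantially in both senses, which is why the "+" and "-" sets are scaled separately.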

Using the 30 parton sets corresponding to the 15 eigenvector directions for variations of
the partons about the minimum one can use Eq. (7) to calculate the error for any quantity,
assuming an allowed $\Delta\chi^2 = 50$. In fact it has been proposed [32] that one may also account
for some of the asymmetry due to departures from quadratic behaviour by replacing Eq. (9) by
the slightly more sophisticated form

$$(\Delta F)_+ = \sqrt{\sum_{k=1}^{n} \left(\max\left(F(S_k^+) - F(S_0),\; F(S_k^-) - F(S_0),\; 0\right)\right)^2},$$

$$(\Delta F)_- = \sqrt{\sum_{k=1}^{n} \left(\max\left(F(S_0) - F(S_k^+),\; F(S_0) - F(S_k^-),\; 0\right)\right)^2}, \qquad (13)$$

where $S_0$ represents the best fit set of partons. In [32] and [33] examples are discussed where
the use of Eq. (13) instead of Eq. (9) leads to not only an asymmetric error, but also a larger
uncertainty overall. We find only fairly minor effects, with no real evidence that Eq. (13) leads
to markedly more reliable results than Eq. (9), so we use the simpler Eq. (9) in this paper.
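
A minimal sketch of the asymmetric prescription of Eq. (13) is given below. The function name `asymmetric_uncertainty` and the numerical inputs are invented for illustration; in a real application the inputs would be the predictions from the central set and from each eigenvector set.

```python
import math

def asymmetric_uncertainty(f0, f_plus, f_minus):
    """Return ((Delta F)_+, (Delta F)_-) following Eq. (13).

    f0      : value of F for the best fit set S_0
    f_plus  : values F(S_k^+), one per eigenvector
    f_minus : values F(S_k^-), one per eigenvector
    """
    up = sum(max(fp - f0, fm - f0, 0.0) ** 2 for fp, fm in zip(f_plus, f_minus))
    down = sum(max(f0 - fp, f0 - fm, 0.0) ** 2 for fp, fm in zip(f_plus, f_minus))
    return math.sqrt(up), math.sqrt(down)

# Hypothetical values for two eigenvector directions. In the second direction
# both sets lower F, so that direction contributes only to (Delta F)_-.
f0 = 10.0
f_plus = [11.0, 9.5]
f_minus = [9.0, 9.8]

dF_up, dF_down = asymmetric_uncertainty(f0, f_plus, f_minus)
print(dF_up, dF_down)
```

When every pair of sets moves $F$ symmetrically in opposite directions, both results reduce to the value given by Eq. (9).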

As an example of the use of the Hessian method we show in Figs. 1-4 the uncertainty on
some of the parton distributions at various values of $Q^2$, namely the $u_V$ distribution, the $d_V$
distribution and the gluon distribution respectively. As one sees, the $u_V$ distribution is very well
determined, and the uncertainty shrinks slightly with increasing $Q^2$. The lowest uncertainty
is in the region of $x = 0.2$ where there are very accurate data which mainly constrain the
valence quarks. At lower $x$ the direct constraint is on the sum of valence and sea quarks. The
$d_V$ distribution is also well determined in general, but is rather more uncertain as we go to
the highest $x$ values. The gluon distribution is known less well, but at the highest $Q^2$ has an
uncertainty of as little as 5% for $x \sim 0.05$ where it is constrained by both $dF_2(x,Q^2)/d\ln Q^2$
of the HERA data and the lowest-$E_T$ Tevatron jet data. It becomes very uncertain for $x > 0.4$
where only the relatively imprecise highest-$E_T$ jet data provide any information. The fractional
uncertainty at very small $x$ decreases very rapidly as $Q^2$ increases because much of the small-$x$
gluon at higher $Q^2$ is generated from that at higher $x$ via evolution. We also show the gluon at
$Q^2 = 2$ GeV$^2$ explicitly in Fig. 4. At this low scale the central gluon is negative at $x = 0.0001$,
but we see that the gluon may be positive within the uncertainty. This just about persists
if we go as low as $x = 10^{-5}$ at this $Q^2$, but at our input scale $Q_0^2 = 1$ GeV$^2$ the gluon
would be negative for $x$ less than 0.0005, outside the level of uncertainty chosen. Also shown
on the plots are the CTEQ6M partons. For the $d_V$ distribution the agreement is excellent. For
the $u_V$ distribution the agreement at $x > 0.05$ is very good, but there is a discrepancy below
this value. However, in this range, the valence quarks become very small indeed and the data
only really constrain the total $u$ distribution which is completely dominated by the sea. This
apparent discrepancy is probably due to parameterization effects, and is irrelevant in practice.
However, in Fig. 3 we see that the MRST and CTEQ gluons show a genuine and significant
level of incompatibility. We will comment on this more in Section 8.

One might worry that the fixing of some of the parameters that determine the input
parton distributions will cast some doubt on the error obtained. However, we have stressed that
these are largely redundant parameters, and we have checked that the errors obtained (when
using $\Delta\chi^2 = 50$) are indeed compatible with the errors obtained using more rigorous means,
i.e., the Lagrange multiplier method, in the following sections.^5 Nevertheless, it is a sign of
the breakdown of the quadratic approximation. Of more practical concern is the fact that
this breakdown is also exhibited in a non-trivial manner in some of the eigenvectors used -
particularly those eigenvectors associated with the least known partons, e.g. the high-$x$ down
quark and gluon. The scaling has been designed to give correct results if $\Delta\chi^2 = 50$ is used.
However, one cannot simply extrapolate to different choices of $\Delta\chi^2$. For example if $\Delta\chi^2 = 25$
were deemed a more suitable choice, in principle the error would just be that using Eq. (7)
divided by $\sqrt{2}$, but the breakdown of quadratic behaviour does not guarantee this, especially for
some directions in parameter space. Also, if one wished to be very conservative in the estimation
of an uncertainty, simple extrapolation cannot reveal when $\Delta\chi^2$ might start to increase rapidly.
We will see examples of this later.
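
The naive rescaling referred to here follows from the $\sqrt{\Delta\chi^2}$ dependence of Eq. (7). A minimal sketch is shown below; the function name is hypothetical, and the result is only meaningful to the extent that the quadratic approximation holds, which, as discussed, is not guaranteed for every direction in parameter space.

```python
import math

def rescale_uncertainty(delta_f, delta_chi2_from, delta_chi2_to):
    """Rescale an uncertainty between two choices of Delta chi^2,
    ASSUMING exact quadratic behaviour of chi^2 (Delta F ~ sqrt(Delta chi^2))."""
    return delta_f * math.sqrt(delta_chi2_to / delta_chi2_from)

# Going from Delta chi^2 = 50 to 25 divides the error by sqrt(2).
print(rescale_uncertainty(1.0, 50.0, 25.0))
```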

Also we note that we have performed this analysis for a fixed value of the coupling constant:
$\alpha_S(M_Z^2) = 0.119$. One can in principle include this as another free parameter. Indeed we then
find that the behaviour obeys the quadratic approximation quite well and that $\Delta\chi^2 = 50$
gives an error of about $\pm 0.003$, corresponding well to our error of $\pm 0.002$ obtained in [9]
using $\Delta\chi^2 = 20$. We will discuss extractions of $\alpha_S(M_Z^2)$ again in Section 8. However, for the
Hessian approach there is a slight difference between varying $\alpha_S(M_Z^2)$ and varying the parton
parameters. When $\alpha_S(M_Z^2)$ is fixed the maximum error on any quantity is obtained from
some linear combination of our different parton sets, and in principle one could reproduce the
particular parton set which corresponds to this linear combination, which would be a perfectly
well-defined set itself. However, a linear combination of $\alpha_S(Q^2)$ coming from contributions
with different $\alpha_S(M_Z^2)$ does not actually correspond to one particular choice of $\alpha_S(M_Z^2)$ (each
contribution has a branch point at a different value of $Q^2$ so a linear combination will have
multiple branch points), so one cannot precisely define a particular set of partons corresponding
to a particular $\alpha_S(M_Z^2)$ for the extreme.

^5 We have checked the effects of using Eq. (13) instead of Eq. (9) in these comparisons. In all cases the former
only introduced a relatively small asymmetry in the uncertainty, with the average being very close indeed to the
result using the latter. Also, the asymmetry was of the same sign as that found using the Lagrange multiplier approach
only half the time, i.e. the use of Eq. (13) did not reliably predict the direction of steeper increase of $\chi^2$
even when the asymmetry was quite large. We find this surprising, and have no good explanation. However,
it illustrates the semi-qualitative nature of the Hessian approach compared to the more rigorous Lagrange
multiplier method.

Hence, although the 30 parton sets obtained using the Hessian approach provide the most
convenient framework for calculating the uncertainties on a physical observable, for the reasons
described above we would also like to study an alternative approach, partially just to check how
well our adapted Hessian approach really works. A more robust method, which also allows us
to directly investigate the partons, and $\alpha_S$, corresponding to the extreme variations of a given
physical quantity is the Lagrange multiplier method. We study this in detail below.


3 Lagrange multiplier method 

It is much more rigorous to investigate the allowed variation of a specific observable by using
the Lagrange multiplier method. This was also one of the approaches used by the CTEQ
collaboration [8]. In this, one performs a series of global fits while constraining the values $\sigma_i$
of one, or more, physical quantities in the neighbourhood of their values $\sigma_i^0$ obtained in the
unconstrained global fit. To be precise, we minimize the function

$$\Psi(\lambda_i, a) = \chi^2_{\rm global}(a) + \sum_i \lambda_i\, \sigma_i(a) \qquad (14)$$

with respect to the usual set $a$ of parameters, which specify the parton distributions and the
coupling $\alpha_S(M_Z^2)$. This global minimization is repeated for many fixed values of the Lagrange
multipliers $\lambda_i$. At the minima, with the lowest $\Psi(\lambda_i, a)$, the observables have the values $\sigma_i(a)$
and the value of $\chi^2_{\rm global}(a)$ is the minimum for these particular values of $\sigma_i$. These optimum
parameter sets $a$ depend on the fixed values of $\lambda_i$. Clearly, when $\lambda_i = 0$, we have $\Psi =
\chi^2_{\rm global} = \chi^2_0$ and $\sigma_i = \sigma_i^0$. In this way we are able to explore how the global description of
the data deteriorates as the $\sigma_i(a)$ move away from the unconstrained best fit values $\sigma_i^0$. Thus
by spanning a range of $\lambda_i$ we obtain the $\chi^2_{\rm global}$ profile for a range of values of $\sigma_i$ about the
best fit values, $\sigma_i^0$. In this study we take the best fit values corresponding to the MRST2001
partons [9].
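
The logic of Eq. (14) can be sketched with a toy model in which the minimization can be done in closed form. All numbers below are hypothetical: $\chi^2$ is taken to be exactly quadratic in two parameters (measured from the best fit) and the observable exactly linear, so this is a stand-in for, not a reproduction of, the full global fit, where each minimization is a complete numerical refit.

```python
# Toy stand-in for the global fit (all numbers hypothetical):
#   chi2(a)  = CHI2_0 + sum_ij H_ij a_i a_j     (a measured from the best fit)
#   sigma(a) = SIGMA_0 + G . a                  (observable linear in a)
H = [[30.0, -5.0],
     [-5.0, 12.0]]
G = [2.0, -1.0]
CHI2_0 = 2328.0      # global chi^2 at the minimum, as quoted in the text
SIGMA_0 = 20.0       # invented best fit value of the observable

def solve_2x2(m, rhs):
    """Solve m x = rhs for a 2x2 matrix by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det,
            (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det]

def constrained_fit(lam):
    """Minimise Psi = chi2 + lam * sigma; return (sigma, chi2) at the minimum."""
    # Stationarity of Psi with respect to a gives 2 H a + lam G = 0.
    a = solve_2x2(H, [-0.5 * lam * G[0], -0.5 * lam * G[1]])
    chi2 = CHI2_0 + sum(H[i][j] * a[i] * a[j] for i in range(2) for j in range(2))
    sigma = SIGMA_0 + G[0] * a[0] + G[1] * a[1]
    return sigma, chi2

# Scan the Lagrange multiplier to map out the chi^2 profile versus sigma.
profile = [constrained_fit(lam) for lam in (-40.0, -20.0, 0.0, 20.0, 40.0)]
for sigma, chi2 in profile:
    print(f"sigma = {sigma:8.4f}   Delta chi^2 = {chi2 - CHI2_0:8.4f}")
```

In this exactly quadratic toy case the profile is a parabola, $\Delta\chi^2 = (\Delta\sigma)^2 / \sum_{ij} G_i C_{ij} G_j$ with $C = H^{-1}$, which is just Eq. (2) read in reverse; in the real fit the value of the method is precisely that no such assumption is needed.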

This procedure involves none of the approximations involved in the Hessian approach. We
can use the full set of parameters in the fit, obtaining maximum flexibility in the partons without
having to worry about the large correlations or anticorrelations between some parameters. We
never make any assumption about quadratic dependence on the parameters, and indeed, by
using different values of the Lagrange multipliers, we can map out precisely how the quadratic
approximation breaks down in the uncertainty for any physical quantity. Also, one produces
a particular set of partons with a particular value of $\alpha_S(M_Z^2)$ at every point in the space of
cross-sections for the physical quantities mapped, so the interpretation of the extremes is more
obvious and natural. Hence, in principle, this is a far superior method of obtaining uncertainties
to the Hessian approach. However, it suffers from the large practical disadvantage that a series
of global fits must be done every time one considers a new quantity. As examples we investigate
a number of interesting physical cases below.


4 The charged-current structure functions $F_2^{\rm CC}(e^\pm p)$

The contour plot for the variation of $F_2^{\rm CC}(e^+p)$ and $F_2^{\rm CC}(e^-p)$ about their predicted values
from the unconstrained global fit is shown in Fig. 5, where we allow $\alpha_S$ to be a free parameter
(unstarred labels) or fix it at the best fit value of $\alpha_S(M_Z^2) = 0.119$ (starred labels). We show
the contours for $\Delta\chi^2 = 50$, 100, etc. Overall, the ellipses one would expect from the quadratic
approximation for $\Delta\chi^2$ in Section 2 are more or less what one sees, but there is a certain
asymmetry in that $\Delta\chi^2$ increases rather more rapidly for an increase in both $F_2^{\rm CC}(e^+p)$ and
$F_2^{\rm CC}(e^-p)$ than for a corresponding decrease in both.

Thus, from Fig. 5, we see that the uncertainties of the $F_2^{CC}(e^+p)$ and $F_2^{CC}(e^-p)$ structure
functions at $x = 0.5$ and $Q^2 = 10{,}000$ GeV$^2$ (due to the experimental errors on the data in the
global fit) are about ^ ±2% respectively. In comparison, the values using the Hessian 

approach are ±10% and ±2% respectively, in good agreement, although slightly smaller for
$F_2^{CC}(e^+p)$. At this value of $x$ the uncertainties in $F_2^{CC}(e^+p)$ and $F_2^{CC}(e^-p)$ have a particularly
simple interpretation, since $F_2^{CC}(e^+p)$ is almost exactly proportional to the valence down quark
distribution, $d_V$, and $F_2^{CC}(e^-p)$ is almost exactly proportional to the $u_V$ distribution. This
can clearly be seen in Fig. 6, which shows the $u$ and $d$ distributions for the extreme sets
(T*, U*, V* and W*) corresponding to maximum and minimum $F_2^{CC}(e^+p)$ and $F_2^{CC}(e^-p)$ (for
the case of fixed $\alpha_S(M_Z^2)$). As expected, the $d$ distribution is largest at large $x$ for the
case of maximum $F_2^{CC}(e^+p)$ and smallest for minimum $F_2^{CC}(e^+p)$, with similar behaviour
for the $u$ distribution and $F_2^{CC}(e^-p)$. Note, however, that in each case the extreme in the
parton distribution is not precisely at $x = 0.5$ but at slightly higher $x$, where the data are less
constraining. There are also sum rules on the partons which must be satisfied. It is also clear
that there is a strong inverse correlation between the $u$ and $d$ distributions. This is because
the data which constrain the relevant partons are the structure function measurements $F_2(\ell p)$,
$F_2(\ell d)$ and $F_{2(3)}(\nu(\bar\nu)p)$, which are essentially proportional to $4u + d$, $u + d$ and $u + d$ respectively,
where $u \approx 4d$ at $x = 0.5$. This constrains $u$ far more than $d$, as we have seen, but means that
for maximum variation in the partons a change in $u$ must be compensated by a much larger
opposite (relative) change in $d$. The result that the major axis of the ellipse for a given change in $\Delta\chi^2$
is approximately aligned along $\delta F_2^{CC}(e^+p) = -\delta F_2^{CC}(e^-p)$ (i.e., $\delta d = -\delta u$) is therefore not at all
surprising. The rate of quickest increase in $\chi^2$ is then along $\delta d = \delta u$, where the changes in the
partons add in such a way as to maximise changes in the measured structure functions.

We see that allowing $\alpha_S(M_Z^2)$ to vary as well allows the error ellipses to grow slightly, mainly
in width. Now the maximum and minimum allowed values of $F_2^{CC}(e^-p)$ (or $u$) correspond to
$\alpha_S(M_Z^2) = 0.117$ and 0.120 and to parton sets T and V respectively. Most of the constraining
data are for $Q^2 \ll 10{,}000$ GeV$^2$, and must be well fit, but smaller $\alpha_S$ means slower evolution
of the quarks and thus greater values of $u$ and $F_2^{CC}(e^-p)$ at $Q^2 = 10{,}000$ GeV$^2$. Opposite
considerations lead to the minimum $F_2^{CC}(e^-p)$. Since the extrema of $F_2^{CC}(e^+p)$ and $d$ are
more involved, due to the negative correlation with the $u$ distribution, they are less altered
by allowing $\alpha_S$ to vary; see sets U and W. We see that the axes of the ellipse are essentially
unchanged when $\alpha_S$ is left free. Thus Fig. 6 is much the same except that the variations for
parton sets T and V are a little greater than for T* and V*.

It is, of course, the fixed target data which constrain these cross-sections and the high-$x$
quarks. It is very largely the BCDMS $F_2(\mu d)$ measurements which are responsible for the upper
extremum in $F_2^{CC}(e^+p)$. The best fit tends to overshoot these data in the region of $x = 0.5$,
and a large increase in $d$ makes the fit to these measurements very poor. For the extrema in
$F_2^{CC}(e^-p)$ and $u$, the deterioration is more evenly spread over nearly all the fixed target
data at $x \sim 0.5$ (with the exception that the description of the BCDMS $F_2(\mu d)$ measurements
improves slightly), but the cumulative result is a very poor fit. One of the worst instances of
deterioration is for the NMC $F_2(n)/F_2(p)$ ratio.


5 W and H production at the LHC and Tevatron 

The contour plot for the variation of $\sigma_W$ and $\sigma_H$ about their predicted values at the LHC
energy from the unconstrained global fit is shown in Fig. 7, where again we allow $\alpha_S$ to be a
free parameter or fix it at $\alpha_S(M_Z^2) = 0.119$. Again we show the contours for $\Delta\chi^2 = 50$, 100, etc.
This time the Hessian approach should work well, although the ellipses start becoming a little
rectangular. Allowing $\alpha_S(M_Z^2)$ to vary, we see that the uncertainties of the $W$ and $H$
cross-sections at the LHC (due to the experimental errors on the data in the global fit) are about
^ 2 ‘q % and ±3% respectively, and are positively correlated. 

Again this analysis also gives information on the uncertainties of particular parton
distributions. To be specific, the parton sets which correspond to the points A, B, C, D on the $\Delta\chi^2 = 50$
contour in Fig. 7 give the uncertainties in the parton distributions that dominantly determine
$\sigma_W$ and $\sigma_H$ in the kinematic domain $x \sim 0.005$, $Q^2 \sim 10^4$ GeV$^2$ relevant to $W$ and $H$ production
at the LHC. The extrema in $\sigma_W$, represented by A and C, correspond to variations in the sea
quark distributions, while the extrema in $\sigma_H$, represented by B and D, correspond to variations
in the gluon distribution and $\alpha_S(M_Z^2)$. The values of $\alpha_S$ for sets A and C are 0.119 and 0.118
respectively, both very close to the default MRST2001 value, showing that $\sigma_W$, which begins
at zeroth order in $\alpha_S$, is insensitive to $\alpha_S$. However, the values of $\alpha_S$ for fits B and D are 0.120 and
0.117 respectively, reflecting the fact that $\sigma_H \propto \alpha_S^2$. This is well illustrated by repeating the
entire analysis with $\alpha_S$ fixed at the default value (0.119) obtained in the unconstrained global
fit [9]. The $\Delta\chi^2 = 50$ and 100 contours for this additional analysis are shown by the smaller
shapes in Fig. 7. We can see that the uncertainty on $\sigma_W$ is almost unchanged, while that for
$\sigma_H$ is reduced to about ±2%. The corresponding values using the Hessian approach are ±1.8%
and ±1.8%, in good agreement but slightly smaller in each case.

The up quark distribution for each extremum set with fixed $\alpha_S(M_Z^2)$ is shown in Fig. 8(a)
and the gluon distribution in Fig. 8(b). We see that the parton distributions do indeed reflect
the extrema in the cross-sections in a fairly simple manner. The quark densities at high $x$ show
almost no variation between fits, since they are well constrained at high $x$ and because the $W$
and $H$ production cross-sections are sensitive to the partons in an $x$ range centered at a few
$\times 10^{-3}$. Indeed the maximum and minimum $W$ cross-sections correspond to the maximum and
minimum sea quarks for $x < 0.05$ at $Q^2 \sim 10{,}000$ GeV$^2$. The maximum and minimum Higgs
cross-sections correspond to the maximum and minimum gluon distributions in the same sort
of range, although the large-$x$ gluon must now decrease for increases in the small-$x$ partons,
and vice versa, in order to maintain the momentum sum rule. The strong correlation between
the two cross-sections is due to the fact that at high $Q^2$ the size of the quark distribution at
small $x$ is mainly determined by evolution, and the larger the small-$x$ gluon the stronger the
quark evolution (and vice versa). When $\alpha_S$ is left free the resulting partons at the extrema are
similar to the fixed-$\alpha_S$ results. However, in this case, their variation is a little larger at smaller
$Q^2$, since the slight changes in $\alpha_S$ lead to different rates of evolution.

For the case of fixed $\alpha_S$ the main contributions to $\Delta\chi^2$ come from the HERA small-$x$
structure function data and, because of the changes in the high-$x$ gluon, also from the Tevatron
jet data. For the upper extrema in $\sigma_W$ and $\sigma_H$ the slope $dF_2(x,Q^2)/d\ln Q^2$ tends to be too
large for $x < 0.001$, while for the lower extrema the slope is too small. In both cases the fit to
jet data deteriorates due to the shape of the high-$x$ gluon becoming wrong. When $\alpha_S(M_Z^2)$ is
allowed to vary the data which are particularly sensitive to this also play a role; for example,
the BCDMS data are fitted less well when $\alpha_S(M_Z^2) = 0.120$ in fit B, and the NMC data are
described less well when $\alpha_S(M_Z^2) = 0.117$ in fit D.

The corresponding $\Delta\chi^2$ contour plot for the Tevatron is shown in Fig. 9, where again we
either allow $\alpha_S$ to be a free parameter or fix it at $\alpha_S(M_Z^2) = 0.119$. We see that the uncertainty
of the $W$ cross-section at the Tevatron (due to the experimental errors on the data in the global
fit) has decreased to about ±1.5% while that for the Higgs has increased to about ±8% for
varying $\alpha_S(M_Z^2)$, and that the correlation has disappeared. For $\alpha_S(M_Z^2)$ fixed at 0.119, $\sigma_W$ is
again largely unaffected, but the uncertainty of $\sigma_H$ now more than halves to about ^ 4 ^ 5 % ,
reflecting the fact that this time the maximum and minimum Higgs cross-sections for variable
$\alpha_S$ correspond to $\alpha_S(M_Z^2) = 0.1215$ and $\alpha_S(M_Z^2) = 0.116$ respectively. With $\alpha_S(M_Z^2)$ fixed
there is now even a very slight anti-correlation between the cross-sections.

The extrema in $\sigma_W$, represented by P and R, correspond roughly to variations in the quark
distributions at $x \sim 0.04$, while the extrema in $\sigma_H$, represented by Q and S, correspond to
variations in the gluon distribution at $x \sim 0.06$ and $\alpha_S(M_Z^2)$. The values of $x$ sampled at
the Tevatron are an order of magnitude greater than at the LHC. This, coupled with the fact
that it is a proton-antiproton collider rather than a proton-proton collider, complicates the
interpretation of the extremes of the cross-sections in terms of partons.

The up quark distribution for each extremum set with fixed $\alpha_S(M_Z^2)$ is shown in Fig. 10(a) and
the gluon distribution in Fig. 10(b). The corresponding distributions obtained when $\alpha_S(M_Z^2)$
is allowed to vary are shown in Fig. 11. We first consider the cases of the maximum and
minimum $W$ cross-sections, which are insensitive to whether $\alpha_S$ is left to vary or not. For
discussion purposes, let us consider only the $u$ and $d$ quark flavour contributions. Then the $W$
cross-section is roughly proportional to

$$u(x_1)\,d(x_2) + d(x_1)\,u(x_2) + \bar u(x_1)\,\bar d(x_2) + \bar d(x_1)\,\bar u(x_2) \qquad (15)$$

where 1 refers to the proton and 2 to the antiproton and $x_1 x_2 = M_W^2/s$. Hence the average
value of $x_i$ is 0.04. This is sufficiently large that there is a distinct difference between the
quark and antiquark distributions, and the contribution to the cross-section from the quark
terms is the greater. Hence, one can decrease the cross-section by replacing a quark by
its antiquark at $x = 0.05$, or vice versa. Of course, there is a fundamental constraint in doing
this due to the sum rule for each valence quark. However, the only real experimental constraint
is from the CCFR $F_3(x,Q^2)$ data, all other structure function data being insensitive to the
distinction between quark and antiquark. In the optimum global fit most data would like there
to be more quarks at high $x$, while the CCFR $F_3(x,Q^2)$ data would prefer more valence quarks
at $x < 0.1$. This leads to a compromise where for the best fit the CCFR $F_3(x,Q^2)$ data at low
$x$ are undershot. The minimum $\sigma_W$ is therefore achieved mainly by this exchange of quark for
antiquark, which most data are happy with, and hence the deterioration in $\chi^2$ at P (and P*) is
almost entirely from the description of the CCFR $F_3(x,Q^2)$ data. Hence, both the gluon and
quark distributions for P (and P*) are hardly changed, as seen in Figs. 10 and 11, but $u - \bar u$
and $d - \bar d$ decrease for $x \sim 0.05$. Going in the other direction, an increase in the valence quarks
at $x = 0.05$ and the consequent decrease in the valence quarks at higher $x$ causes a large penalty
in $\chi^2$, and the maximum $\sigma_W$ is achieved in a different manner. At $x \sim 0.05$ the quark evolves much more
slowly than at $x \sim 0.005$ and the density at $Q^2 \sim 10{,}000$ GeV$^2$ is determined largely by the
input value, and modified by the rate of evolution. Hence the maximum $\sigma_W$ is achieved by
having a large quark distribution at $x \sim 0.05$ at low $Q^2$ and also by having an enhanced gluon
at $x \sim 0.05$ to increase evolution. These are displayed in Figs. 10 and 11. The deterioration
in $\chi^2$ then comes mostly from the low-$Q^2$ quarks causing an overshooting of NMC structure
function data, but there is also a contribution due to the enhanced gluon at $x \sim 0.1$ causing it
to be smaller for $x > 0.1$ and hence fitting the Tevatron jet data less well.

The extrema of the Higgs cross-section are also slightly complicated. It is not possible to
simply increase or decrease the gluon in a range centered on $x \sim 0.05$ because this is precisely
the $x$ region where the majority of the gluon's momentum is carried, and this total is very
well constrained by the momentum sum rule and the accurate high-$x$ quark determination.
Therefore, for fixed $\alpha_S(M_Z^2)$ the change in $\sigma_H$ is largely reliant on the fact that this total
cross-section actually probes partons within about an order of magnitude either side of the central
production value of $x = M_H/\sqrt{s}$. Hence, as we see from Fig. 10, the maximum cross-section is
obtained from the gluon in set Q*, which is slightly reduced for $x < 0.04$ and more enhanced for
$x > 0.04$, and the minimum cross-section is obtained from the gluon in set S*, which is slightly
increased for $x < 0.04$ and more reduced for $x > 0.04$. In both cases those data sets sensitive
to the small-$x$ and large-$x$ gluon, i.e., HERA structure function data and Tevatron jet data
respectively, are those for which the description deteriorates. When $\alpha_S(M_Z^2)$ is allowed to go
free it varies by about ±0.003 and there is a large increase in the variation of $\sigma_H$. This is not
only because $\sigma_H \propto \alpha_S^2$ but also because the HERA data anti-correlate $\alpha_S$ and the small-$x$ gluon.
Therefore, in set Q, for example, the increased value of $\alpha_S(M_Z^2)$ allows the small-$x$ gluon to
get much smaller, and the high-$x$ gluon much larger, compared to set Q*. This compensation
between $\alpha_S$ and the small-$x$ gluon also means that the HERA data remain well fit, and it is the jet
data (particularly CDF), sensitive to large $x$, and the large-$\alpha_S$-phobic BCDMS data, for which
the description deteriorates. Similar considerations apply to set S as compared to S*. Here it
is the D0 jet data and the small-$\alpha_S$-phobic SLAC and NMC data that are badly fit.

For $\Delta\chi^2 = 50$ the Hessian approach gives an uncertainty of ±1.2% for $\sigma_W$ and ±3% for $\sigma_H$
at the Tevatron energy. In simplistic terms this is in good agreement, but a little smaller for
the gluon-sensitive Higgs cross-section. However, in this case we see from Fig. 9 a very marked
asymmetry in the contour plot. For fixed $\alpha_S(M_Z^2)$ the ellipses are certainly not centered on
the best fit values, and for varying $\alpha_S(M_Z^2)$ we see that $\chi^2$ is clearly increasing far more rapidly
for increases in the predicted $W$ cross-section than for corresponding decreases. Thus, it is
clear that within the framework of this fit, increases of the cross-section of much more than
3% are completely ruled out, whereas decreases of the same amount are much more acceptable.
This information would be largely lost in the Hessian approach, and for these quantities the
Lagrange multiplier method does supply some important additional information.
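
How an asymmetric uncertainty is read off a non-quadratic profile can be sketched as follows. This is only an illustration: the skewed profile `delta_chi2` below is an invented function of the percentage shift of an observable, not a fit to Fig. 9, and the tolerance of 50 is simply the criterion used in the text. In practice the profile would be interpolated from the points produced by the series of constrained fits.

```python
def delta_chi2(shift):
    """Hypothetical profile: a parabola multiplied by a small skew.

    'shift' is the percentage change of the observable from its best-fit value.
    """
    return 50.0 * (shift / 3.0) ** 2 * (1.0 + 0.1 * shift)


def crossing(profile, target, lo, hi, tol=1e-10):
    """Bisection for profile(x) = target, assuming exactly one crossing in [lo, hi]."""
    f_lo = profile(lo) - target
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = profile(mid) - target
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # crossing lies in the upper half
        else:
            hi = mid                # crossing lies in the lower half
    return 0.5 * (lo + hi)


def asymmetric_errors(profile, target=50.0, reach=6.0):
    """Return (lower, upper) shifts at which the profile reaches the tolerance."""
    upper = crossing(profile, target, 0.0, reach)
    lower = crossing(profile, target, -reach, 0.0)
    return lower, upper


if __name__ == "__main__":
    lower, upper = asymmetric_errors(delta_chi2)
    print(f"allowed shift: {lower:+.2f}% to {upper:+.2f}%")
```

For this invented profile the allowed upward shift is noticeably smaller than the allowed downward shift, which is the kind of information a single symmetric error would hide.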


6 The ratio of $W^-$ to $W^+$ production at the LHC

The ratio of the $W^-$ to the $W^+$ production cross-sections at hadron colliders is a particularly
interesting observable. The measurement is expected to be quite precise (better than ±1% at
the LHC, see e.g. [34]), since many of the experimental uncertainties cancel in the ratio. The
uncertainty in the prediction of the ratio at the LHC can be deduced from the $\Delta\chi^2$ profile
shown in Fig. 12. Taking, as before, the $\Delta\chi^2 = 50$ measure, we obtain $\Delta(W^-/W^+) = \pm 1.3\%$,
and the Hessian approach is in very good agreement with this. Since the $W^-/W^+$ ratio is
sensitive to the ratio of the $d$ and $u$ quark distributions, it is not surprising that the increase
in $\chi^2$ is almost entirely due to the NMC $F_2(n)/F_2(p)$ data [25].

A detailed discussion of the $W^-/W^+$ ratio may be found in Ref. [35]. Consider, for instance,
the ratio as a function of the $W$ rapidity $y$,

$$\frac{d\sigma/dy\,(W^-)}{d\sigma/dy\,(W^+)} \simeq \frac{d(x_1)\,\bar u(x_2)}{u(x_1)\,\bar d(x_2)} \simeq \frac{d(x_1)}{u(x_1)}, \qquad (16)$$

where $x_1 = (M_W/\sqrt{s})\,e^{y} = 0.0057\,e^{y}$ at the LHC. In Eq. (16) we have ignored, for simplicity, the
contributions involving strange and heavier quarks. Thus a measurement of the ratio at large $y$
would provide a direct determination of $d/u$ at large $x$. For example, for $y \sim 4$, we measure $d/u$
at $x \sim 0.3$ at the LHC. Of course, it is the decay lepton rapidity that is measured, rather than
the parent $W$ rapidity, and so the ratio in a given rapidity bin will have a greater uncertainty
than that for $\sigma(W^-)/\sigma(W^+)$.
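
The kinematics behind these numbers can be checked with a few lines of Python. The sketch assumes leading-order kinematics, $x_{1,2} = (M_W/\sqrt{s})\,e^{\pm y}$, a $W$ mass of 80.4 GeV and an LHC energy of 14 TeV; these inputs and the function name are assumptions made for the illustration.

```python
import math

M_W = 80.4            # GeV, approximate W mass (assumed input)
SQRT_S_LHC = 14000.0  # GeV, assumed LHC centre-of-mass energy


def momentum_fractions(mass, sqrt_s, y):
    """Leading-order x1, x2 for producing a state of the given mass at rapidity y."""
    x0 = mass / sqrt_s
    return x0 * math.exp(y), x0 * math.exp(-y)


if __name__ == "__main__":
    for y in (0.0, 2.0, 4.0):
        x1, x2 = momentum_fractions(M_W, SQRT_S_LHC, y)
        print(f"y = {y:3.1f}:  x1 = {x1:.4f}   x2 = {x2:.6f}")
```

At $y = 0$ both fractions equal $M_W/\sqrt{s} \approx 0.0057$, while at $y = 4$ the larger fraction is about 0.31, in line with the statement that the ratio at large rapidity probes $d/u$ at $x \sim 0.3$.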


7 The moments of the $(u - d)$ distribution

The parton distribution functions of the nucleon are fundamental quantities that should, in
principle, be calculable from first principles in QCD. In particular, the $x$ moments of parton
distributions at a given scale are related, by the operator product expansion, to a product
of perturbatively calculable Wilson coefficients and non-perturbative matrix elements of quark
and gluon operators. The latter can be computed using lattice QCD and, indeed, in recent
years the precision of the lattice calculations has improved significantly. Although in principle
the lattice results can be related to moments of physical structure functions, in practice it is
more efficient to use parton distributions determined in a global fit to represent the physical
'data'. Comparisons of recent lattice moment calculations with the predictions of earlier MRS
parton distributions are encouraging, see for example [13, 14].

In order to quantify the agreement between the lattice calculations and the parton
distribution predictions it is obviously important to know the uncertainties in the latter. It is
straightforward to apply the Lagrange multiplier method, used in previous sections to determine
the uncertainties in observable cross-sections, to the moments of parton distributions.

To avoid contamination from gluon contributions, the lattice calculations focus on the
moments of non-singlet quark operators. For example, lattice results are available for the first
three moments of the combination $u - d$, i.e.,

$$\langle x^{N-1} \rangle = \int_0^1 dx\; x^{N-1} \left[ u(x,Q^2) - d(x,Q^2) \right] \qquad (17)$$

with $N = 2, 3, 4$. The predictions of the MRST2001 set (at $Q^2 = 4$ GeV$^2$) for these moments
are given in Table 1.
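
The integral in Eq. (17) is simple to evaluate numerically once a parton set is given. The sketch below uses a hypothetical valence-like shape, $u - d = A\,x^{a}(1-x)^{b}$, whose parameters are invented for illustration and are not the MRST2001 values; its only purpose is to show the moment integral and to check a simple midpoint-rule estimate against the closed form provided by the Euler beta function.

```python
import math

# Hypothetical shape u(x) - d(x) = A * x**a * (1 - x)**b.
# Illustrative numbers only; these are NOT the MRST2001 parameters.
A, a, b = 1.5, -0.5, 3.0


def u_minus_d(x):
    """Toy non-singlet combination at a fixed scale."""
    return A * x**a * (1.0 - x)**b


def moment_numeric(n, steps=20000):
    """Midpoint-rule estimate of int_0^1 dx x^(n-1) [u(x) - d(x)]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x**(n - 1) * u_minus_d(x)
    return h * total


def moment_exact(n):
    """Closed form A * B(n + a, b + 1) for the toy shape."""
    p, q = n + a, b + 1.0
    return A * math.gamma(p) * math.gamma(q) / math.gamma(p + q)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"N = {n}:  numeric = {moment_numeric(n):.5f}   exact = {moment_exact(n):.5f}")
```

Higher $N$ weights the integrand towards $x \to 1$, so the moments fall with $N$ and become increasingly sensitive to the large-$x$ region.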

The contour plot for the (percentage) variation of the second and third moments about
their predicted values is shown in Fig. 13. We again show the $\Delta\chi^2 = 50$ and 100 contours
corresponding to the fixed $\alpha_S$ analysis, but there is evidently little difference between the fixed
and variable coupling results in this case.

As expected, there is a strong positive correlation between the two moments. Using the
$\Delta\chi^2 = 50$, varying $\alpha_S$ criterion for defining a conservative error, we obtain errors of ±4.2%,
±4.8% and ±5.0% for the second, third and fourth moments respectively. The corresponding
predictions for the errors on the moments are also given in Table 1. The increasing relative
error with increasing moment is to be expected: higher moments probe the $x \to 1$ region, where
there are fewer DIS structure function data. Again we notice that there is a small asymmetry
in the contours, the increase in $\chi^2$ when both moments increase being less severe than when
both moments decrease.


  N     Moment at $Q^2 = 4$ GeV$^2$     % error
  2     0.1655(70)                      4.2
  3     0.0544(26)                      4.8
  4     0.0232(12)                      5.0

Table 1: The moments and their errors of the $(u - d)$ distribution, Eq. (17), predicted at $Q^2 =
4$ GeV$^2$ using MRST2001 partons [9].


The uncertainties on the moments using the Lagrange multiplier method with a fixed $\alpha_S$
are slightly smaller: ±4.1%, ±4.3% and ±4.7% for the second, third and fourth moments
respectively. These results are in excellent agreement with the (fixed $\alpha_S$) Hessian approach,
where the corresponding errors are ±3.9%, ±4.3% and ±4.6%.

Since, as we have already seen in Section 4, the $u$ quark at high $x$ is far more constrained
than the $d$ quark, the allowed variation in these moments is mainly due to variations in the
$d_V$ distribution. The minimum extremum (H in Fig. 13) of the moments is therefore due to
the largest allowed $d_V$ distribution at high $x$ and arises from a similar set of partons to those
for the maximum $F_2^{CC}(e^+p)$. Thus, as in this previous case, it is mainly the comparison to
the BCDMS $F_2(\mu d)$ measurements which causes the deterioration in the quality of the fit. The
maximum of the moments (G in Fig. 13) corresponds roughly to the minimum $d_V$ distribution
at high $x$ and it is largely the fit to the $F_2(n)/F_2(p)$ ratio that breaks down.

For a number of years, lattice QCD has been used to calculate the moments of nucleon
structure functions from first principles. The most recent comprehensive results are from the
LHPC-SESAM [13] and QCDSF [14] collaborations. Although the comparisons with experiment
(via parton distributions obtained from global fits) are encouraging, there are still many
problems to be overcome, for example finite lattice spacing and volume effects, renormalization
and mixing of operators, unquenching and chiral extrapolation to physical quark masses. A
comparison of the recent lattice results [13, 14] with the above MRST2001 moment predictions
reveals that (a) the errors in the latter are at present significantly smaller than in the
former, especially for the higher moments, and (b) the lattice results for the moments are
systematically higher. The explanation appears to be that the linear chiral extrapolation used in
the lattice determinations is not valid: non-perturbative long-distance effects in the nucleon
give rise to a nonlinear, non-analytic dependence on $m_q$ [36]-[40] which is particularly important
at small $m_q$. In the most recent analyses (see for example the comprehensive study in [41]),
the experimental (i.e., pdf) values for the moments are used to constrain a priori unknown
non-perturbative parameters which enter in the non-analytic terms in the chiral extrapolation
formula. It will be interesting to investigate the effect of using the new MRST2001 moment
predictions and errors in such studies.


8 Comparison between different central parton sets 

So far in this paper we have investigated the uncertainty on physical quantities coming from the
experimental error of the measurements used to determine the parton distributions. We have
discussed both the Hessian and Lagrange multiplier approaches, concluding that the latter is in
principle preferable, but recognizing the practical advantages of the former. We have compared
the results each provides for the uncertainties using the $\Delta\chi^2 = 50$ criterion, noting that they
are generally in good agreement. The Hessian approach does tend to give slightly smaller
uncertainties for the quantities sensitive to the least well-determined partons, i.e., $\sigma_H$, which is
sensitive to the gluon distribution, and $F_2^{CC}(e^+p)$, which is sensitive to the high-$x$ down quark
distribution. This is probably partly due to the neglected effect of the not entirely redundant
parameters, and partly due to errors associated with those eigenvectors which do not respect
the quadratic approximation for $\Delta\chi^2$ too well, which indeed are mainly concerned with the
gluon and high-$x$ down quark. However, the discrepancy is quite small, and we judge that we
can trust the Hessian approach, at least for $\Delta\chi^2$ in the region of 50 or less, to give quantitative
results. Hence, for fixed $\alpha_S(M_Z^2) = 0.119$, we have made available 30 parton sets corresponding
to the 15 different eigenvector directions in the space of variation of parton parameters away
from their values at the minimum of the global fit, each set corresponding to an increase in
$\chi^2$ of 50. These can easily be used to obtain the error on any physical quantity, as outlined
in Section 2. We have also made available various parton sets with fixed and varying $\alpha_S(M_Z^2)$
corresponding to extreme variations in the predictions for various important cross-sections and
other relevant observables.
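
As an indication of how such paired sets are typically used, the sketch below applies the standard symmetric linear-propagation formula, $\Delta F = \frac{1}{2}\sqrt{\sum_k [F(S_k^+) - F(S_k^-)]^2}$, to values of an observable computed with each pair of eigenvector sets. Section 2 is not reproduced here, so treating this as the exact prescription of the paper is an assumption; the function name and the numbers in the example are hypothetical.

```python
import math


def hessian_uncertainty(plus, minus):
    """Symmetric error on an observable from paired eigenvector sets.

    plus[k] and minus[k] are the values of the observable computed with the
    two parton sets displaced in opposite directions along eigenvector k.
    """
    if len(plus) != len(minus):
        raise ValueError("need one '+' and one '-' value per eigenvector")
    return 0.5 * math.sqrt(sum((p - m) ** 2 for p, m in zip(plus, minus)))


if __name__ == "__main__":
    # Two hypothetical eigenvector directions, observable normalised to 1.
    plus = [1.03, 1.04]
    minus = [0.97, 0.96]
    print(f"Delta F = {hessian_uncertainty(plus, minus):.3f}")
```

With 15 eigenvector directions the two lists would each hold 15 entries, one per displaced parton set. The formula is symmetric by construction, which is why asymmetries of the kind seen in Fig. 9 are not captured by it.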

We note that the uncertainties obtained due to the errors on the experimental data are
generally very small, of the order of 1-5%, except for quantities sensitive to the high-$x$ down
quark and gluon, where they can approach 10%. However, in all of this we have implicitly
assumed that the theoretical procedure is precisely compatible with the data used; we have
not considered the uncertainties due to (i) the data sets chosen, (ii) the choice of starting
parameterizations, (iii) the heavy target corrections, etc. In practice this assumption is far from true,
as discussed in the Introduction. In this final section we acknowledge this to some extent
and investigate qualitatively the impact of the initial assumptions going into the fit on the
uncertainty on some quantities. In order to do this we first perform a slightly updated fit of
our own (which includes minor modifications in terms of parameterization and the treatment
of errors and data sets) so as to produce the best set of up-to-date partons. This was partially
inspired by the question of why CTEQ6 [5] gives a much better fit to the Tevatron jet data than
MRST2001, but also by the availability of new ZEUS data. We call the new set MRST2002
partons.

8.1 CTEQ6, MRST2001 and a new parton set (MRST2002) 

We found that we can improve the fits to jets within the global fit by a couple of modifications.
In order to obtain the best global fit with partons input at $Q_0^2 = 1$ GeV$^2$ we had previously
found that we needed a parameterization which allows the gluon to go negative at small $x$.
Hence we used

$$xg(x,Q_0^2) = A_g (1-x)^{\eta_g} (1 + \epsilon_g x^{0.5} + \gamma_g x)\, x^{\delta_g} - A_- (1-x)^{\eta_-} x^{-\delta_-}, \qquad (18)$$

where $A_- \sim 0.2$, $\delta_- \sim 0.3$ and $\eta_-$ was fixed at $\sim 10$, so as not to affect the high-$x$ distribution.
Unexpectedly, allowing $\eta_-$ to vary to $\sim 25$ resulted in a slight improvement in the fit to Tevatron
jets. We also modified our treatment of the errors for the Drell-Yan data [28]. The fit to these
data actually competes with that to the jets, and using only statistical errors, as in our previous
studies (the systematic errors being defined a little vaguely), over-emphasizes the effect of the
Drell-Yan measurements. Adding 5% systematic errors in quadrature to the statistical errors
(which is probably the best approach [28]) also improves the fit to the jet data. Both these
modifications appear appropriate and are implemented in our updated set. Also included in
the new analysis are the new ZEUS high-$Q^2$ data [42], which have little effect on the partons.
The only significant change in the MRST2002 partons, compared to MRST2001 partons [9], is
an increase in the gluon at high $x$, which we show in Fig. 14. The fit to the Tevatron jet data
now has $\chi^2 = 154/113$ compared to $\chi^2 = 170/113$ for MRST2001, and the fit to the Drell-Yan
data with 5% systematic errors has $\chi^2 = 187/136$. The quality of fit for all other data sets is
almost identical to that for the MRST2001 partons.
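
The effect of the modified error treatment, and the quoted fit qualities, can be made concrete with a short sketch. The helper names and the example data point are hypothetical; only the 5% fraction and the $\chi^2$ values and numbers of points are taken from the text.

```python
import math


def total_error(value, stat_error, frac_syst=0.05):
    """Statistical error combined in quadrature with a fractional systematic error."""
    return math.sqrt(stat_error**2 + (frac_syst * value)**2)


def chi2_per_point(chi2, n_points):
    """Average chi2 contribution per data point."""
    return chi2 / n_points


if __name__ == "__main__":
    # Hypothetical Drell-Yan point: value 100 with a statistical error of 12.
    print(f"total error = {total_error(100.0, 12.0):.1f}")
    # Fit qualities quoted in the text.
    for label, chi2, n in (("jets, MRST2002", 154, 113),
                           ("jets, MRST2001", 170, 113),
                           ("Drell-Yan, MRST2002", 187, 136)):
        print(f"{label}: chi2 per point = {chi2_per_point(chi2, n):.2f}")
```

Enlarging the errors in this way reduces the weight of the Drell-Yan points in the global $\chi^2$, which is what allows the competing jet data more influence on the high-$x$ gluon.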

The CTEQ6 partons are very similar to the MRST2001 (and MRST2002) partons in most
aspects. However, in this CTEQ analysis [5] a number of different choices are made about
the way in which the fit is implemented, which leads mainly to a significantly different gluon
distribution. These differences are: the development of a different type of parameterization
for the partons, which allows for a different shape at very high $x$; CTEQ omit data below
$Q^2 = 4$ GeV$^2$, compared to our choice of $Q^2 = 2$ GeV$^2$; they do not fit to some data sets used
in [9], i.e., they omit SLAC and one H1 high-$Q^2$ set of $F_2$ data; they use 10% systematic errors
(in quadrature) for Drell-Yan data; moreover, CTEQ have a positive-definite small-$x$ gluon at
their starting scale of $Q_0^2 = 1.69$ GeV$^2$. They also use a massless charm prescription and there
are various other minor differences as compared with MRST.

The CTEQ6 gluon is also shown in Fig. 14. Clearly MRST2002 has a similar high-$x$ gluon
to CTEQ6, both being larger than MRST2001. However, the MRST gluons are different from
the CTEQ6 gluon at smaller $x$ due to their freedom to have a negative input distribution, and
due to slight differences in the choice of data sets fitted. The different assumptions made in
obtaining the CTEQ partons, although they improve the quality of the jet fit, do not lead to
the best fit when including the data sets omitted by CTEQ, and the fit is not good at all for
data with $Q^2 < 4$ GeV$^2$. Hence, within the context of trying to obtain as inclusive a global fit
as possible using NLO QCD, we take MRST2002 to be the best set of parton distributions.

Footnote: The MRST2002 parton set can be found at http://durpdg.dur.ac.uk/hepdata/mrs.

Footnote: The way in which these different assumptions lead to an improved fit to the Tevatron jet data is outlined
in [43].

8.2 Comparison of predictions for $\sigma_W$ and for $\sigma_H$

The predictions for $W$ and Higgs cross-sections using the different partons are shown in Fig. 15.
Since MRST2002 only differs from MRST2001 in the high-$x$ gluon, to which these
cross-sections are insensitive, the predictions for MRST2002 are very similar to those of MRST2001.
(Hence our decision to keep MRST2001 partons as the base set for this paper.) However, the
corresponding predictions obtained using the CTEQ6 partons are quite different. At the LHC
the prediction for $\sigma_W$ is similar, but $\sigma_H$ is towards the top of our (qualitative) 95% confidence
level. From Fig. 14 this is clearly due to the larger gluon in the $x \sim 0.005$ region, which is due
to the positive-definite input for the CTEQ6 gluon. At the Tevatron the discrepancy between
CTEQ6 and MRST is even larger. The CTEQ6 predictions for both $\sigma_W$ and $\sigma_H$ are effectively
completely outside our expectations. The reason for the small prediction of $\sigma_H$ is evident from
Fig. 14: the CTEQ6 gluon is considerably smaller in the region of $x = 0.1$. This, in turn,
is then responsible for a slower evolution of the quarks, making them smaller at high $Q^2$ and
hence making $\sigma_W$ smaller. Presumably the difference comes about because CTEQ6 use a more
restricted form of the gluon and omit one H1 data set and the $Q^2 < 4$ GeV$^2$ data, which prefer larger
$dF_2(x,Q^2)/d\ln Q^2$. Whatever the precise reasons for the discrepancies, it is clear that different
choices for the overall framework of the global fit can completely outweigh the uncertainties
due to errors on the data actually chosen to go into the fit. It would be easy to illustrate similar
types of discrepancy by comparing to other alternative sets of partons. In particular, due to the
absence of the Tevatron jets in the fits, many of the parton sets in [1]-[7] have rather smaller
gluons at large $x$, and would have different predictions for various quantities sensitive to the
high-$x$ gluon.

8.3 Comparison of predictions for $\alpha_S(M_Z^2)$

We also find a large variation in the values of $\alpha_S(M_Z^2)$ extracted from the fits of the
different collaborations: CTEQ6 [5], ZEUS [7], MRST2001 [9], H1 [6], Alekhin [3] and Giele et al.
(GKK) [2]. The resulting values of $\alpha_S(M_Z^2)$ are listed in Table 2, together with the
determination of this work (MRST2002), in order of decreasing tolerance $\Delta\chi^2$, which is
reflected in the size of the corresponding experimental error. Not all are presented as determinations
of $\alpha_S(M_Z^2)$, but all are extracted using the same criteria as for the uncertainty on partons in
the respective fit, and hence should be as reliable. Clearly there is a very large variation, with
some very low values. The uncertainties due to experimental errors are determined in different
fashions in each case, and a summary can be found in [44]. We use $\Delta\chi^2_{\rm eff}$ for the ZEUS
determination [7], because they use the offset method for determining uncertainties, which for
$\Delta\chi^2 = 1$ gives a larger uncertainty than the more common Hessian method. ZEUS estimate that this
is equivalent to $\Delta\chi^2 \approx 50$ if they were to use the same treatment of errors as CTEQ. We also
use $\Delta\chi^2_{\rm eff}$ for the GKK value [2], because the uncertainties are obtained using confidence
limits, but the error quoted corresponds to the one sigma usually associated with $\Delta\chi^2 = 1$.

Group      Variation of $\chi^2$           $\alpha_S(M_Z^2)$
CTEQ6      $\Delta\chi^2 = 100$            $0.1165 \pm 0.0065$(exp)
ZEUS       $\Delta\chi^2_{\rm eff} = 50$   $0.1166 \pm 0.0049$(exp) $\pm 0.0018$(model) $\pm 0.004$(theory)
MRST02     $\Delta\chi^2 = 20$             $0.1195 \pm 0.002$(exp) $\pm 0.003$(theory)
MRST01     $\Delta\chi^2 = 20$             $0.1190 \pm 0.002$(exp) $\pm 0.003$(theory)
H1         $\Delta\chi^2 = 1$              $0.115 \pm 0.0017$(exp) $^{+0.0009}_{-0.0005}$(model) $\pm 0.005$(theory)
Alekhin    $\Delta\chi^2 = 1$              $0.1171 \pm 0.0015$(exp) $\pm 0.0033$(theory)
GKK        $\Delta\chi^2_{\rm eff} = 1$    $0.112 \pm 0.001$(exp)

Table 2: Values of $\alpha_S(M_Z^2)$ and its error from different NLO QCD fits.
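
As a quick numerical cross-check of Table 2, the following minimal Python sketch stores the central values and experimental errors listed there and compares the spread of the central values with the quoted experimental errors. It also evaluates the simple scaling from a 90% confidence-level tolerance to a one-sigma tolerance, assuming that $\chi^2$ is quadratic near the minimum. The names `TABLE2`, `central_spread` and `one_sigma_tolerance` are illustrative only and are not taken from any of the fitting codes.

```python
# (group, tolerance, central alpha_S(M_Z^2), experimental error) as listed in Table 2.
# For ZEUS and GKK the tolerance is the effective value quoted in the table.
TABLE2 = [
    ("CTEQ6", 100, 0.1165, 0.0065),
    ("ZEUS", 50, 0.1166, 0.0049),
    ("MRST02", 20, 0.1195, 0.002),
    ("MRST01", 20, 0.1190, 0.002),
    ("H1", 1, 0.115, 0.0017),
    ("Alekhin", 1, 0.1171, 0.0015),
    ("GKK", 1, 0.112, 0.001),
]


def central_spread(rows):
    """Difference between the largest and smallest central value."""
    values = [value for _, _, value, _ in rows]
    return max(values) - min(values)


def one_sigma_tolerance(delta_chi2, n_sigma):
    """Rescale a tolerance quoted at n_sigma to one sigma.

    Assumption: chi^2 is quadratic near the minimum, so the tolerance
    scales as the square of the number of standard deviations.
    """
    return delta_chi2 / n_sigma ** 2


spread = central_spread(TABLE2)
all_errors_below_spread = all(err < spread for _, _, _, err in TABLE2)
tolerance = one_sigma_tolerance(50.0, 1.65)

print(f"spread of central values: {spread:.4f}")
print(f"every experimental error below the spread: {all_errors_below_spread}")
print(f"one-sigma tolerance from 50 at 1.65 sigma: {tolerance:.1f}")
```

With the Table 2 inputs, the spread of the central values exceeds every quoted experimental error, and the rescaled tolerance comes out a little above 18, which is the basis for rounding to $\Delta\chi^2 = 20$.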

The model errors incorporate such effects as the heavy quark prescription and masses,
parameterizations, changes in the starting scale of evolution etc. The theory error is often
determined by variation of renormalization and factorization scales, though MRST use an
estimate appealing to current knowledge of NNLO and resummations, which we feel is more
reliable. Since each fit is centered on NLO QCD with scales equal to $Q^2$, the “theory errors”
are very strongly correlated, and cannot therefore be responsible for the differences. These
discrepancies are undoubtedly due to the assumptions going into the fits, mainly on which data
sets are included and which cuts on $Q^2$ and $W^2$ are used.
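
The argument that strongly correlated theory errors cannot account for the differences can be made concrete with the standard formula for the uncertainty on the difference of two correlated quantities. The sketch below is illustrative only: the correlation coefficient `rho` is an assumed input, not a number quoted by any of the groups, and the function name `diff_uncertainty` is hypothetical. The central values and theory errors are the MRST02 and Alekhin entries of Table 2.

```python
import math


def diff_uncertainty(sigma_a, sigma_b, rho):
    """Uncertainty on (a - b) for errors sigma_a, sigma_b with correlation rho."""
    variance = sigma_a ** 2 + sigma_b ** 2 - 2.0 * rho * sigma_a * sigma_b
    return math.sqrt(max(variance, 0.0))  # guard against rounding below zero


# Central values and theory errors from Table 2.
mrst_value, mrst_theory = 0.1195, 0.003
alekhin_value, alekhin_theory = 0.1171, 0.0033

difference = mrst_value - alekhin_value

# rho = 0 and rho = 1 are assumed limiting cases, chosen for illustration.
uncorrelated = diff_uncertainty(mrst_theory, alekhin_theory, rho=0.0)
fully_correlated = diff_uncertainty(mrst_theory, alekhin_theory, rho=1.0)

print(f"difference of central values:    {difference:.4f}")
print(f"theory error on it, rho = 0:     {uncorrelated:.4f}")
print(f"theory error on it, rho = 1:     {fully_correlated:.4f}")
```

If the theory errors were independent, the difference between the two central values would sit well inside the combined theory error. In the fully correlated limit the theory error on the difference nearly cancels, so the difference has to come from elsewhere.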

MRST, who obtain the largest value of $\alpha_S(M_Z^2)$, use the widest range of data sets and also
the least conservative cuts.* CTEQ use only a slightly smaller number of data sets but also
cut data below $Q^2 = 4$ GeV$^2$, as described previously. They also use a definition of the NLO
coupling which truncates the solution of the renormalization group equation, whereas most
other groups use the full solution of the NLO equation. Both approaches are equally correct,
but the truncation of the solution leads to a slightly higher value of $\alpha_S(Q^2)$ at scales below
$M_Z^2$, for the same value of $\alpha_S(M_Z^2)$, than the other method, and thus tends to yield a lower
$\alpha_S(M_Z^2)$. CTEQ also have a very conservative estimate of the error, though it is meant to be
somewhat more than a one-sigma error. ZEUS and Alekhin use a similar selection of data sets,
i.e., HERA DIS data (only ZEUS data in the former case) and a number of fixed-target DIS
data sets. Hence, it is unsurprising that they obtain similar central values of $\alpha_S(M_Z^2)$, with
respective errors which are easily explained by their choices of $\Delta\chi^2$. H1 and GKK both use a
small number of sets of data: the former collaboration uses H1 DIS data [6, 10] and BCDMS
fixed-target proton DIS data [22], while GKK use older H1 DIS data [45] together with BCDMS
and E665 [26] fixed-target proton DIS data. Both determinations are heavily influenced by the
BCDMS proton data set, which prefers rather small $\alpha_S(M_Z^2)$, and this feeds into the final
values. Also, both are strict in their statistical interpretation, obtaining small uncertainties,
even with relatively small data samples. Finally we note that only CTEQ and MRST include
the Tevatron jet data in their analyses. This is relevant because of the $\alpha_S$-gluon correlation.

*The slightly different treatment in this work (MRST2002) leads to a marginal raising of $\alpha_S(M_Z^2)$ as compared
to MRST2001 [9], as seen in Table 2. We still use $\Delta\chi^2 = 20$ for our one-sigma uncertainty, since if $\Delta\chi^2 = 50$
corresponds to 90% confidence level, or 1.65 sigma, simple scaling implies that one sigma corresponds to
$\Delta\chi^2 = 50/(1.65)^2$, i.e. $\Delta\chi^2 = 20$ to a good approximation.
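
The effect of truncating the solution of the NLO renormalization group equation can be checked numerically. The sketch below is a minimal illustration, not the code of any group. It assumes a fixed number of flavours $n_f = 5$ with no threshold matching, takes the truncated form to be the standard expansion of the coupling in inverse powers of $\ln(Q^2/\Lambda^2)$, and obtains the full solution by integrating the two-loop equation with a fourth-order Runge-Kutta step. Both are normalised to the same $\alpha_S(M_Z^2) = 0.119$, and all function names are hypothetical.

```python
import math

NF = 5                      # assumption: fixed flavour number, no thresholds
B0 = (33 - 2 * NF) / (12 * math.pi)
B1 = (153 - 19 * NF) / (24 * math.pi ** 2)
MZ = 91.1876                # GeV


def alpha_truncated(log_ratio):
    """Truncated NLO form as a function of L = ln(Q^2 / Lambda^2)."""
    return (1.0 / (B0 * log_ratio)) * (
        1.0 - (B1 / B0 ** 2) * math.log(log_ratio) / log_ratio
    )


def solve_log_ratio_at_mz(alpha_mz, lo=5.0, hi=30.0):
    """Bisection for L at Q = M_Z; the truncated form decreases with L here."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if alpha_truncated(mid) > alpha_mz:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def alpha_trunc_at(q, alpha_mz):
    """Truncated coupling at scale q (GeV), fixed to alpha_mz at M_Z."""
    log_ratio_mz = solve_log_ratio_at_mz(alpha_mz)
    return alpha_truncated(log_ratio_mz + math.log(q ** 2 / MZ ** 2))


def beta(a):
    """Two-loop beta function for d(alpha_S)/d(ln Q^2)."""
    return -B0 * a ** 2 - B1 * a ** 3


def alpha_full(q, alpha_mz, steps=2000):
    """Runge-Kutta integration of the two-loop equation from M_Z to q (GeV)."""
    h = math.log(q ** 2 / MZ ** 2) / steps
    a = alpha_mz
    for _ in range(steps):
        k1 = beta(a)
        k2 = beta(a + 0.5 * h * k1)
        k3 = beta(a + 0.5 * h * k2)
        k4 = beta(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


a_trunc = alpha_trunc_at(5.0, 0.119)
a_full = alpha_full(5.0, 0.119)
print(f"Q = 5 GeV: truncated {a_trunc:.4f}, full {a_full:.4f}")
```

Under these assumptions the truncated form lies slightly above the full solution at $Q = 5$ GeV for the same value at $M_Z$, in line with the statement in the text. A fit that prefers a given coupling at low scales would then return a somewhat lower $\alpha_S(M_Z^2)$ when the truncated form is used.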

8.4 Final comment 

From the discussion of the previous two subsections, it is clear that different ideas about the
best way to perform an NLO fit can lead to a wide variation in both the central values and
the errors of $\alpha_S(M_Z^2)$, as well as in predictions for physical quantities such as $\sigma_W$ and
$\sigma_H$. The fact that the various ‘NLO’ fits can yield such different outputs is disturbing, and is
indicative of the uncertainty arising from theoretical assumptions. Indeed, we have always believed
that ‘theory’, rather than experiment, will provide the dominant source of error [44]. We have
already produced approximate NNLO parton distributions and predictions [47] (based on the
approximate splitting functions [48] obtained from the known NNLO moments [49]), and find,
for example, that the NNLO W cross-section at the Tevatron is 4% higher than at NLO, and
believe that this result is reliable. This change is somewhat larger than the uncertainty due to
experimental errors shown in Fig. 9. Moreover, W production is likely to be subject to smaller
theoretical uncertainty than many other observables, particularly those directly related to the
gluon. For example, our estimates for the uncertainty in $F_L(x,Q^2)$ at small $x$ are 10% or more
even at $Q^2 = 10{,}000$ GeV$^2$, and significantly larger at lower $Q^2$. Hence, an understanding
of theoretical uncertainties is clearly a priority at present, and a preliminary attempt at this
will be the subject of a future publication [50].

[Figure: Uncertainty of up valence quark from Hessian method; horizontal axis $x$ from $10^{-4}$ to 1]

Figure 1: The uncertainty on $u_V(x,Q^2)$ at $Q^2 = 5$ GeV$^2$ and 100 GeV$^2$ obtained using the
Hessian approach with $\Delta\chi^2 = 50$. Also shown is the CTEQ6M distribution. The uncertainties
are shown relative to the MRST2001 set of partons [9]; the label G is explained in footnote 5.

[Figure: Uncertainty of down valence quark from Hessian method; horizontal axis $x$ from $10^{-4}$ to 1]

Figure 2: The uncertainty on $d_V(x,Q^2)$ at $Q^2 = 2$ GeV$^2$ and 100 GeV$^2$ obtained using the
Hessian approach with $\Delta\chi^2 = 50$. Also shown is the CTEQ6M distribution. The uncertainties
are shown relative to the MRST2001 set of partons [9]; the label G is explained in footnote 5.

[Figure: Uncertainty of gluon from Hessian method]

Figure 3: The uncertainty on $g(x,Q^2)$ at $Q^2 = 5$ GeV$^2$ and 100 GeV$^2$ obtained using the
Hessian approach with $\Delta\chi^2 = 50$. Also shown is the CTEQ6M distribution. The uncertainties
are shown relative to the MRST2001 set of partons [9]; the label G is explained in footnote 5.

[Figure: Uncertainty of gluon from Hessian method]

Figure 4: The uncertainty on $g(x,Q^2)$ at $Q^2 = 2$ GeV$^2$ obtained using the Hessian approach
with $\Delta\chi^2 = 50$. Also shown is the CTEQ6M distribution.

Figure 5: $\Delta\chi^2 = 50, 100, \ldots$ contours, where $\Delta\chi^2$ is the increase in $\chi^2$ from the global
MRST2001 minimum, obtained by performing new global fits with [...] fixed at values
in the neighbourhood of their value in the unconstrained MRST2001 fit. The $\Delta\chi^2 = 50$
contour is taken to represent the errors on [...] (arising from the experimental errors on
the data used in the global fit). The extrema sets of partons (T,U,...) are discussed in the
text. The dashed contours are obtained if $\alpha_S(M_Z^2)$ is allowed to vary. The superimposed solid
$\Delta\chi^2 = 50, 100$ contours are obtained if $\alpha_S(M_Z^2)$ is fixed at 0.119.

[Figure: two panels, "Uncertainty in u quark distribution ($\alpha_S$ fixed)" and "Uncertainty in d quark distribution ($\alpha_S$ fixed)"]

Figure 6: The $u$ and $d$ quark distributions (at $Q$ [...] lie on the $\Delta\chi^2 = 50$ contour of
Fig. 5 for fixed [...]

Figure 7: Contours with $\Delta\chi^2 = 50, 100, \ldots$ obtained by performing global fits with the values
of $\sigma_W$ and $\sigma_H$, at the LHC energy, fixed in the neighbourhood of their values predicted by the
unconstrained MRST2001 fit. The $\Delta\chi^2 = 50$ contour is taken to represent the errors on $\sigma_W$
and $\sigma_H$ (arising from the experimental errors on the data used in the global fit). The extrema
sets of partons (A,B...) are discussed in the text. The dashed contours are obtained if $\alpha_S(M_Z^2)$
is allowed to vary. The superimposed solid $\Delta\chi^2 = 50, 100$ contours are obtained if $\alpha_S(M_Z^2)$ is
fixed at 0.119.

[Figure: two panels, "Uncertainty in u quark distribution ($\alpha_S$ fixed)" and "Uncertainty in gluon distribution ($\alpha_S$ fixed)", showing the ratio of the MRST(LHC) extremum sets to MRST2001]

Figure 8: The up quark and gluon distributions at $Q^2 = 10$ and $10^4$ GeV$^2$ in the extrema global
fits on the $\Delta\chi^2 = 50$ contour of the $\sigma_{W,H}$(LHC) plot of Fig. 7 for $\alpha_S(M_Z^2)$ fixed at 0.119.

Figure 9: As for Fig. 7, but for the Tevatron energy of $\sqrt{s} = 1.8$ TeV.

[Figure: two panels, "Uncertainty in u quark distribution ($\alpha_S$ fixed)" and "Uncertainty in gluon distribution ($\alpha_S$ fixed)"]

Figure 10: The up quark and gluon distributions at $Q^2 = 10$ and $10^4$ GeV$^2$ found in the extrema
global fits on the $\Delta\chi^2 = 50$ contour of the $\sigma_{W,H}$(Tevatron) plot of Fig. 9 with $\alpha_S(M_Z^2)$ fixed at
0.119.

[Figure: two panels, "Uncertainty in u quark distribution ($\alpha_S$ free)" and "Uncertainty in gluon distribution ($\alpha_S$ free)"]

Figure 11: As for Fig. 10, but with $\alpha_S(M_Z^2)$ allowed to vary.

[Figure: Variation of $\sigma(W^-)/\sigma(W^+)$ about MRST2001 value of 0.749]

Figure 12: The variation of $\chi^2$ obtained by performing global fits with $\sigma(W^-)/\sigma(W^+)$ fixed at
different values in the neighbourhood of the value obtained in the unconstrained MRST2001
fit. For $\Delta\chi^2 = 50$ we see that the uncertainty in the ratio is ±1.3%.

Figure 13: The $\Delta\chi^2$ contours obtained by performing global fits with the values of the $N = 2$
and $N = 3$ moments of the $u - d$ distribution fixed in the neighbourhood of their values predicted
by the MRST2001 global fit. The dashed and solid curves correspond to fits with $\alpha_S(M_Z^2)$
varying and fixed respectively.

Figure 14: The CTEQ6 [5] and MRST2002 gluon compared to MRST2001 [9] gluon at $Q^2 =$ [...]
and $10^4$ GeV$^2$.

[Figure: $\chi^2$ increase in global analysis as the W and H cross sections are varied at the LHC]

Figure 15: The CTEQ6 [5] and MRST2002 predictions of $\sigma_W$, $\sigma_H$ at the LHC and Tevatron
energies, shown on the $\Delta\chi^2$ contour plots centered on the MRST2001 partons [9]. The $\Delta\chi^2$
contours are taken from Figs. 7 and 9 respectively, for the case in which $\alpha_S(M_Z^2)$ is a free
parameter. The inner contour with $\Delta\chi^2 = 50$ is taken to represent the error on the observables
$\sigma_W$ and $\sigma_H$ arising from the experimental errors of the data that are used in the global fit.